Overview

Dataset statistics

Number of variables: 23
Number of observations: 41188
Missing cells: 38364
Missing cells (%): 4.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 7.2 MiB
Average record size in memory: 184.0 B
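
The missing-cell percentage follows from the counts above: missing cells divided by the total number of cells (observations × variables). A minimal Python sketch of that arithmetic, using only figures from this report. The per-variable missing counts are copied from the variable sections further down, and the function name is illustrative.

```python
def missing_cell_percentage(missing: int, rows: int, columns: int) -> float:
    """Return missing cells as a percentage of all cells (rows * columns)."""
    return 100.0 * missing / (rows * columns)


# Figures copied from the report.
n_variables = 23
n_observations = 41188
missing_by_variable = {"b2": 990, "c8": 35563, "school": 1731, "marriage-status": 80}

# The per-variable counts add up to the reported total of missing cells.
missing_cells = sum(missing_by_variable.values())
pct = missing_cell_percentage(missing_cells, n_observations, n_variables)
print(f"{missing_cells} missing cells = {pct:.1f}% of all cells")
```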

Variable types

Numeric: 11
Categorical: 8
Boolean: 4

Alerts

i1 is highly correlated with i2 and 2 other fields [High correlation]
i2 is highly correlated with i1 [High correlation]
i4 is highly correlated with i1 and 1 other fields [High correlation]
i5 is highly correlated with i1 and 1 other fields [High correlation]
n4 is highly correlated with n6 [High correlation]
n6 is highly correlated with n4 [High correlation]
i1 is highly correlated with i2 and 2 other fields [High correlation]
i2 is highly correlated with i1 and 2 other fields [High correlation]
i4 is highly correlated with i1 and 2 other fields [High correlation]
i5 is highly correlated with i1 and 3 other fields [High correlation]
n4 is highly correlated with n6 [High correlation]
n6 is highly correlated with i5 and 1 other fields [High correlation]
i1 is highly correlated with i2 and 2 other fields [High correlation]
i2 is highly correlated with i1 [High correlation]
i4 is highly correlated with i1 and 1 other fields [High correlation]
i5 is highly correlated with i1 and 1 other fields [High correlation]
n4 is highly correlated with n6 [High correlation]
n6 is highly correlated with n4 [High correlation]
successful_sell is highly correlated with c10 [High correlation]
month is highly correlated with c4 [High correlation]
c4 is highly correlated with month [High correlation]
c10 is highly correlated with successful_sell [High correlation]
age is highly correlated with employment [High correlation]
c10 is highly correlated with c8 and 4 other fields [High correlation]
c4 is highly correlated with i1 and 5 other fields [High correlation]
c8 is highly correlated with c10 and 5 other fields [High correlation]
employment is highly correlated with age and 1 other fields [High correlation]
i1 is highly correlated with c4 and 5 other fields [High correlation]
i2 is highly correlated with c4 and 6 other fields [High correlation]
i3 is highly correlated with c10 and 9 other fields [High correlation]
i4 is highly correlated with c10 and 10 other fields [High correlation]
i5 is highly correlated with c10 and 8 other fields [High correlation]
month is highly correlated with c4 and 5 other fields [High correlation]
n4 is highly correlated with c8 and 4 other fields [High correlation]
n6 is highly correlated with i4 and 1 other fields [High correlation]
school is highly correlated with employment [High correlation]
successful_sell is highly correlated with c10 and 4 other fields [High correlation]
b2 has 990 (2.4%) missing values [Missing]
c8 has 35563 (86.3%) missing values [Missing]
school has 1731 (4.2%) missing values [Missing]
n5 has unique values [Unique]
n6 has 35563 (86.3%) zeros [Zeros]
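
The high-correlation alerts flag pairs of variables whose correlation coefficient is large in absolute value. The sketch below illustrates the idea for the Pearson coefficient only. The 0.9 threshold, the function names, and the toy data are assumptions made for illustration; they are not taken from this report or from the configuration that produced it.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def is_highly_correlated(xs, ys, threshold=0.9):
    """Flag a pair when |r| reaches the (assumed) threshold."""
    return abs(pearson(xs, ys)) >= threshold


# Toy data, invented for this example only.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 8.0, 9.8]   # nearly linear in xs
zs = [5, 1, 4, 2, 3]             # only weakly related to xs

print(is_highly_correlated(xs, ys), is_highly_correlated(xs, zs))
```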

Reproduction

Analysis started: 2021-10-04 02:28:47.693261
Analysis finished: 2021-10-04 02:29:11.159366
Duration: 23.47 seconds
Software version: pandas-profiling v3.1.0
Download configuration: config.json
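
The reported duration is the difference between the two timestamps above. A small check in Python, using only the standard library and the values printed in this section:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-10-04 02:28:47.693261")
finished = datetime.fromisoformat("2021-10-04 02:29:11.159366")

duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```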

Variables

age
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 78
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.02406041
Minimum: 17
Maximum: 98
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 17
5-th percentile: 26
Q1: 32
median: 38
Q3: 47
95-th percentile: 58
Maximum: 98
Range: 81
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.42124998
Coefficient of variation (CV): 0.2603746315
Kurtosis: 0.7913115312
Mean: 40.02406041
Median Absolute Deviation (MAD): 7
Skewness: 0.7846968158
Sum: 1648511
Variance: 108.6024512
Monotonicity: Not monotonic
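
Several of the figures listed for age are derived from others: the mean is the sum divided by the number of observations, the range and IQR are differences of quantiles, the variance is the squared standard deviation, and the coefficient of variation is the standard deviation divided by the mean. A short consistency check with the reported values (variable names are illustrative):

```python
# Figures copied from the age section of the report.
n = 41188
total = 1648511
minimum, maximum = 17, 98
q1, q3 = 32, 47
std = 10.42124998

mean = total / n                  # reported mean: 40.02406041
value_range = maximum - minimum   # reported range: 81
iqr = q3 - q1                     # reported IQR: 15
variance = std ** 2               # reported variance: 108.6024512
cv = std / mean                   # reported CV: 0.2603746315

print(mean, value_range, iqr, variance, cv)
```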
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
31 | 1947 | 4.7%
32 | 1846 | 4.5%
33 | 1833 | 4.5%
36 | 1780 | 4.3%
35 | 1759 | 4.3%
34 | 1745 | 4.2%
30 | 1714 | 4.2%
37 | 1475 | 3.6%
29 | 1453 | 3.5%
39 | 1432 | 3.5%
Other values (68) | 24204 | 58.8%

Value | Count | Frequency (%)
17 | 5 | < 0.1%
18 | 28 | 0.1%
19 | 42 | 0.1%
20 | 65 | 0.2%
21 | 102 | 0.2%
22 | 137 | 0.3%
23 | 226 | 0.5%
24 | 463 | 1.1%
25 | 598 | 1.5%
26 | 698 | 1.7%

Value | Count | Frequency (%)
98 | 2 | < 0.1%
95 | 1 | < 0.1%
94 | 1 | < 0.1%
92 | 4 | < 0.1%
91 | 2 | < 0.1%
89 | 2 | < 0.1%
88 | 22 | 0.1%
87 | 1 | < 0.1%
86 | 8 | < 0.1%
85 | 15 | < 0.1%
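
The frequency column mixes rounded percentages with the labels "< 0.1%" and "> 99.9%". The sketch below shows one formatting rule that is consistent with the figures in this report (for example, a count of 22 out of 41188 prints as 0.1%, while a count of 15 prints as < 0.1%). The rule and the function name are assumptions inferred from those figures, not taken from the tool's source.

```python
def format_frequency(count: int, total: int) -> str:
    """Format count/total as a percentage with one decimal place.

    Shares that round to 0 or 1 at three decimals, without being exactly
    0 or 1, are shown as edge-case labels instead (assumed rule).
    """
    share = count / total
    if share > 0 and round(share, 3) == 0:
        return "< 0.1%"
    if share < 1 and round(share, 3) == 1:
        return "> 99.9%"
    return f"{share * 100:.1f}%"


total_rows = 41188
for count in (1947, 22, 15, 41178):
    print(count, format_frequency(count, total_rows))
```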

b1
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

yes: 21576
no: 18622
-1: 990

Length

Max length: 3
Median length: 3
Mean length: 2.523841896
Min length: 2
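
The length statistics describe the string length of each category label, weighted by how often the label occurs. A short sketch that recomputes them for b1 from the category counts reported in this section (variable names are illustrative):

```python
# Category counts for b1 as reported.
counts = {"yes": 21576, "no": 18622, "-1": 990}

total = sum(counts.values())
# Mean length: label length weighted by the number of rows with that label.
mean_length = sum(len(label) * c for label, c in counts.items()) / total
max_length = max(len(label) for label in counts)
min_length = min(len(label) for label in counts)

print(total, round(mean_length, 9), max_length, min_length)
```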

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: yes
2nd row: yes
3rd row: no
4th row: yes
5th row: no

Common Values

Value | Count | Frequency (%)
yes | 21576 | 52.4%
no | 18622 | 45.2%
-1 | 990 | 2.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
yes | 21576 | 52.4%
no | 18622 | 45.2%
1 | 990 | 2.4%

b2
Boolean

MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 990
Missing (%): 2.4%
Memory size: 80.6 KiB

False: 33950
True: 6248
(Missing): 990

Value | Count | Frequency (%)
False | 33950 | 82.4%
True | 6248 | 15.2%
(Missing) | 990 | 2.4%

c10
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB

False: 36548
True: 4640

Value | Count | Frequency (%)
False | 36548 | 88.7%
True | 4640 | 11.3%

c3
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

False: 32588
unknown: 8597
True: 3

Length

Max length: 7
Median length: 5
Mean length: 5.417378848
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: False
2nd row: False
3rd row: unknown
4th row: False
5th row: unknown

Common Values

Value | Count | Frequency (%)
False | 32588 | 79.1%
unknown | 8597 | 20.9%
True | 3 | < 0.1%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
false | 32588 | 79.1%
unknown | 8597 | 20.9%
true | 3 | < 0.1%

c4
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

new: 26144
old: 15044

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: new
2nd row: new
3rd row: new
4th row: new
5th row: new

Common Values

Value | Count | Frequency (%)
new | 26144 | 63.5%
old | 15044 | 36.5%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
new | 26144 | 63.5%
old | 15044 | 36.5%

c8
Boolean

HIGH CORRELATION
MISSING

Distinct: 2
Distinct (%): < 0.1%
Missing: 35563
Missing (%): 86.3%
Memory size: 80.6 KiB

False: 4252
True: 1373
(Missing): 35563

Value | Count | Frequency (%)
False | 4252 | 10.3%
True | 1373 | 3.3%
(Missing) | 35563 | 86.3%

dow
Categorical

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

thu: 8623
mon: 8514
wed: 8134
tue: 8090
fri: 7827

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: fri
2nd row: thu
3rd row: tue
4th row: mon
5th row: tue

Common Values

Value | Count | Frequency (%)
thu | 8623 | 20.9%
mon | 8514 | 20.7%
wed | 8134 | 19.7%
tue | 8090 | 19.6%
fri | 7827 | 19.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
thu | 8623 | 20.9%
mon | 8514 | 20.7%
wed | 8134 | 19.7%
tue | 8090 | 19.6%
fri | 7827 | 19.0%

employment
Categorical

HIGH CORRELATION

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

assistant: 10422
laborer: 9254
engineer: 6743
customer service: 3969
management: 2924
Other values (7): 7876

Length

Max length: 16
Median length: 8
Mean length: 8.918519957
Min length: 4

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: management
2nd row: assistant
3rd row: leisure
4th row: assistant
5th row: assistant

Common Values

Value | Count | Frequency (%)
assistant | 10422 | 25.3%
laborer | 9254 | 22.5%
engineer | 6743 | 16.4%
customer service | 3969 | 9.6%
management | 2924 | 7.1%
leisure | 1720 | 4.2%
hobbyist | 1456 | 3.5%
self-employed | 1421 | 3.5%
cleaner | 1060 | 2.6%
none | 1014 | 2.5%
Other values (2) | 1205 | 2.9%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
assistant | 10422 | 23.1%
laborer | 9254 | 20.5%
engineer | 6743 | 14.9%
customer | 3969 | 8.8%
service | 3969 | 8.8%
management | 2924 | 6.5%
leisure | 1720 | 3.8%
hobbyist | 1456 | 3.2%
self-employed | 1421 | 3.1%
cleaner | 1060 | 2.3%
Other values (3) | 2219 | 4.9%

i1
Real number (ℝ)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.08188550063
Minimum: -3.4
Maximum: 1.4
Zeros: 0
Zeros (%): 0.0%
Negative: 17191
Negative (%): 41.7%
Memory size: 321.9 KiB

Quantile statistics

Minimum: -3.4
5-th percentile: -2.9
Q1: -1.8
median: 1.1
Q3: 1.4
95-th percentile: 1.4
Maximum: 1.4
Range: 4.8
Interquartile range (IQR): 3.2

Descriptive statistics

Standard deviation: 1.570959741
Coefficient of variation (CV): 19.18483405
Kurtosis: -1.062631525
Mean: 0.08188550063
Median Absolute Deviation (MAD): 0.3
Skewness: -0.7240955492
Sum: 3372.7
Variance: 2.467914506
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=10)
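
Because i1 takes only 10 distinct values, the fixed-size binning behind this histogram can be retraced from the value counts listed for the variable. The code below is a hand-rolled sketch of equal-width binning, not the plotting code used by the report; the function name and the choice to place the maximum in the last bin are assumptions.

```python
def equal_width_histogram(value_counts, bins):
    """Bin (value, count) pairs into `bins` equal-width intervals."""
    values = [v for v, _ in value_counts]
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / bins
    totals = [0] * bins
    for value, count in value_counts:
        # The maximum would land at index `bins`; clamp it into the last bin.
        index = min(int((value - lo) / width), bins - 1)
        totals[index] += count
    return totals


# Value counts for i1 as listed in the report.
i1_counts = [(-3.4, 1071), (-3.0, 172), (-2.9, 1663), (-1.8, 9184), (-1.7, 773),
             (-1.1, 635), (-0.2, 10), (-0.1, 3683), (1.1, 7763), (1.4, 16234)]

hist = equal_width_histogram(i1_counts, bins=10)
print(hist)
```

Three of the ten bins stay empty, which matches the gaps between the clusters of observed values.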
Value | Count | Frequency (%)
1.4 | 16234 | 39.4%
-1.8 | 9184 | 22.3%
1.1 | 7763 | 18.8%
-0.1 | 3683 | 8.9%
-2.9 | 1663 | 4.0%
-3.4 | 1071 | 2.6%
-1.7 | 773 | 1.9%
-1.1 | 635 | 1.5%
-3 | 172 | 0.4%
-0.2 | 10 | < 0.1%

Value | Count | Frequency (%)
-3.4 | 1071 | 2.6%
-3 | 172 | 0.4%
-2.9 | 1663 | 4.0%
-1.8 | 9184 | 22.3%
-1.7 | 773 | 1.9%
-1.1 | 635 | 1.5%
-0.2 | 10 | < 0.1%
-0.1 | 3683 | 8.9%
1.1 | 7763 | 18.8%
1.4 | 16234 | 39.4%

Value | Count | Frequency (%)
1.4 | 16234 | 39.4%
1.1 | 7763 | 18.8%
-0.1 | 3683 | 8.9%
-0.2 | 10 | < 0.1%
-1.1 | 635 | 1.5%
-1.7 | 773 | 1.9%
-1.8 | 9184 | 22.3%
-2.9 | 1663 | 4.0%
-3 | 172 | 0.4%
-3.4 | 1071 | 2.6%

i2
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 93.57566437
Minimum: 92.201
Maximum: 94.767
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 92.201
5-th percentile: 92.713
Q1: 93.075
median: 93.749
Q3: 93.994
95-th percentile: 94.465
Maximum: 94.767
Range: 2.566
Interquartile range (IQR): 0.919

Descriptive statistics

Standard deviation: 0.578840049
Coefficient of variation (CV): 0.00618579684
Kurtosis: -0.8298085772
Mean: 93.57566437
Median Absolute Deviation (MAD): 0.38
Skewness: -0.2308876514
Sum: 3854194.464
Variance: 0.3350558023
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value | Count | Frequency (%)
93.994 | 7763 | 18.8%
93.918 | 6685 | 16.2%
92.893 | 5794 | 14.1%
93.444 | 5175 | 12.6%
94.465 | 4374 | 10.6%
93.2 | 3616 | 8.8%
93.075 | 2458 | 6.0%
92.201 | 770 | 1.9%
92.963 | 715 | 1.7%
92.431 | 447 | 1.1%
Other values (16) | 3391 | 8.2%

Value | Count | Frequency (%)
92.201 | 770 | 1.9%
92.379 | 267 | 0.6%
92.431 | 447 | 1.1%
92.469 | 178 | 0.4%
92.649 | 357 | 0.9%
92.713 | 172 | 0.4%
92.756 | 10 | < 0.1%
92.843 | 282 | 0.7%
92.893 | 5794 | 14.1%
92.963 | 715 | 1.7%

Value | Count | Frequency (%)
94.767 | 128 | 0.3%
94.601 | 204 | 0.5%
94.465 | 4374 | 10.6%
94.215 | 311 | 0.8%
94.199 | 303 | 0.7%
94.055 | 229 | 0.6%
94.027 | 233 | 0.6%
93.994 | 7763 | 18.8%
93.918 | 6685 | 16.2%
93.876 | 212 | 0.5%

i3
Real number (ℝ)

HIGH CORRELATION

Distinct: 26
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -40.50260027
Minimum: -50.8
Maximum: -26.9
Zeros: 0
Zeros (%): 0.0%
Negative: 41188
Negative (%): 100.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: -50.8
5-th percentile: -47.1
Q1: -42.7
median: -41.8
Q3: -36.4
95-th percentile: -33.6
Maximum: -26.9
Range: 23.9
Interquartile range (IQR): 6.3

Descriptive statistics

Standard deviation: 4.628197856
Coefficient of variation (CV): -0.1142691537
Kurtosis: -0.3585583105
Mean: -40.50260027
Median Absolute Deviation (MAD): 4.4
Skewness: 0.3031798587
Sum: -1668221.1
Variance: 21.4202154
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Value | Count | Frequency (%)
-36.4 | 7763 | 18.8%
-42.7 | 6685 | 16.2%
-46.2 | 5794 | 14.1%
-36.1 | 5175 | 12.6%
-41.8 | 4374 | 10.6%
-42 | 3616 | 8.8%
-47.1 | 2458 | 6.0%
-31.4 | 770 | 1.9%
-40.8 | 715 | 1.7%
-26.9 | 447 | 1.1%
Other values (16) | 3391 | 8.2%

Value | Count | Frequency (%)
-50.8 | 128 | 0.3%
-50 | 282 | 0.7%
-49.5 | 204 | 0.5%
-47.1 | 2458 | 6.0%
-46.2 | 5794 | 14.1%
-45.9 | 10 | < 0.1%
-42.7 | 6685 | 16.2%
-42 | 3616 | 8.8%
-41.8 | 4374 | 10.6%
-40.8 | 715 | 1.7%

Value | Count | Frequency (%)
-26.9 | 447 | 1.1%
-29.8 | 267 | 0.6%
-30.1 | 357 | 0.9%
-31.4 | 770 | 1.9%
-33 | 172 | 0.4%
-33.6 | 178 | 0.4%
-34.6 | 174 | 0.4%
-34.8 | 264 | 0.6%
-36.1 | 5175 | 12.6%
-36.4 | 7763 | 18.8%

i4
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 316
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.621290813
Minimum: 0.634
Maximum: 5.045
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0.634
5-th percentile: 0.797
Q1: 1.344
median: 4.857
Q3: 4.961
95-th percentile: 4.966
Maximum: 5.045
Range: 4.411
Interquartile range (IQR): 3.617

Descriptive statistics

Standard deviation: 1.734447405
Coefficient of variation (CV): 0.4789583313
Kurtosis: -1.406802622
Mean: 3.621290813
Median Absolute Deviation (MAD): 0.108
Skewness: -0.7091879564
Sum: 149153.726
Variance: 3.0083078
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
4.857 | 2868 | 7.0%
4.962 | 2613 | 6.3%
4.963 | 2487 | 6.0%
4.961 | 1902 | 4.6%
4.856 | 1210 | 2.9%
4.964 | 1175 | 2.9%
1.405 | 1169 | 2.8%
4.965 | 1071 | 2.6%
4.864 | 1044 | 2.5%
4.96 | 1013 | 2.5%
Other values (306) | 24636 | 59.8%

Value | Count | Frequency (%)
0.634 | 8 | < 0.1%
0.635 | 43 | 0.1%
0.636 | 14 | < 0.1%
0.637 | 6 | < 0.1%
0.638 | 7 | < 0.1%
0.639 | 16 | < 0.1%
0.64 | 10 | < 0.1%
0.642 | 35 | 0.1%
0.643 | 23 | 0.1%
0.644 | 38 | 0.1%

Value | Count | Frequency (%)
5.045 | 9 | < 0.1%
5 | 7 | < 0.1%
4.97 | 172 | 0.4%
4.968 | 992 | 2.4%
4.967 | 643 | 1.6%
4.966 | 622 | 1.5%
4.965 | 1071 | 2.6%
4.964 | 1175 | 2.9%
4.963 | 2487 | 6.0%
4.962 | 2613 | 6.3%

i5
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 11
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5167.035911
Minimum: 4963.6
Maximum: 5228.1
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 4963.6
5-th percentile: 5017.5
Q1: 5099.1
median: 5191
Q3: 5228.1
95-th percentile: 5228.1
Maximum: 5228.1
Range: 264.5
Interquartile range (IQR): 129

Descriptive statistics

Standard deviation: 72.25152767
Coefficient of variation (CV): 0.01398316732
Kurtosis: -0.003760375696
Mean: 5167.035911
Median Absolute Deviation (MAD): 37.1
Skewness: -1.044262407
Sum: 212819875.1
Variance: 5220.28325
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=11)
Value | Count | Frequency (%)
5228.1 | 16234 | 39.4%
5099.1 | 8534 | 20.7%
5191 | 7763 | 18.8%
5195.8 | 3683 | 8.9%
5076.2 | 1663 | 4.0%
5017.5 | 1071 | 2.6%
4991.6 | 773 | 1.9%
5008.7 | 650 | 1.6%
4963.6 | 635 | 1.5%
5023.5 | 172 | 0.4%

Value | Count | Frequency (%)
4963.6 | 635 | 1.5%
4991.6 | 773 | 1.9%
5008.7 | 650 | 1.6%
5017.5 | 1071 | 2.6%
5023.5 | 172 | 0.4%
5076.2 | 1663 | 4.0%
5099.1 | 8534 | 20.7%
5176.3 | 10 | < 0.1%
5191 | 7763 | 18.8%
5195.8 | 3683 | 8.9%

Value | Count | Frequency (%)
5228.1 | 16234 | 39.4%
5195.8 | 3683 | 8.9%
5191 | 7763 | 18.8%
5176.3 | 10 | < 0.1%
5099.1 | 8534 | 20.7%
5076.2 | 1663 | 4.0%
5023.5 | 172 | 0.4%
5017.5 | 1071 | 2.6%
5008.7 | 650 | 1.6%
4991.6 | 773 | 1.9%

marriage-status
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 80
Missing (%): 0.2%
Memory size: 321.9 KiB

married: 24928
single: 11568
divorced: 4612

Length

Max length: 8
Median length: 7
Mean length: 6.830787195
Min length: 6

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: divorced
2nd row: divorced
3rd row: married
4th row: married
5th row: married

Common Values

Value | Count | Frequency (%)
married | 24928 | 60.5%
single | 11568 | 28.1%
divorced | 4612 | 11.2%
(Missing) | 80 | 0.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
married | 24928 | 60.6%
single | 11568 | 28.1%
divorced | 4612 | 11.2%

month
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 321.9 KiB

may: 13769
jul: 7174
aug: 6178
jun: 5318
nov: 4101
Other values (5): 4648

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: apr
2nd row: may
3rd row: jul
4th row: nov
5th row: jul

Common Values

Value | Count | Frequency (%)
may | 13769 | 33.4%
jul | 7174 | 17.4%
aug | 6178 | 15.0%
jun | 5318 | 12.9%
nov | 4101 | 10.0%
apr | 2632 | 6.4%
oct | 718 | 1.7%
sep | 570 | 1.4%
mar | 546 | 1.3%
dec | 182 | 0.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
may | 13769 | 33.4%
jul | 7174 | 17.4%
aug | 6178 | 15.0%
jun | 5318 | 12.9%
nov | 4101 | 10.0%
apr | 2632 | 6.4%
oct | 718 | 1.7%
sep | 570 | 1.4%
mar | 546 | 1.3%
dec | 182 | 0.4%

n2
Real number (ℝ≥0)

Distinct: 42
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.567592503
Minimum: 1
Maximum: 56
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 7
Maximum: 56
Range: 55
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 2.770013543
Coefficient of variation (CV): 1.078836903
Kurtosis: 36.97979514
Mean: 2.567592503
Median Absolute Deviation (MAD): 1
Skewness: 4.762506697
Sum: 105754
Variance: 7.672975028
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=42)
Value | Count | Frequency (%)
1 | 17642 | 42.8%
2 | 10570 | 25.7%
3 | 5341 | 13.0%
4 | 2651 | 6.4%
5 | 1599 | 3.9%
6 | 979 | 2.4%
7 | 629 | 1.5%
8 | 400 | 1.0%
9 | 283 | 0.7%
10 | 225 | 0.5%
Other values (32) | 869 | 2.1%

Value | Count | Frequency (%)
1 | 17642 | 42.8%
2 | 10570 | 25.7%
3 | 5341 | 13.0%
4 | 2651 | 6.4%
5 | 1599 | 3.9%
6 | 979 | 2.4%
7 | 629 | 1.5%
8 | 400 | 1.0%
9 | 283 | 0.7%
10 | 225 | 0.5%

Value | Count | Frequency (%)
56 | 1 | < 0.1%
43 | 2 | < 0.1%
42 | 2 | < 0.1%
41 | 1 | < 0.1%
40 | 2 | < 0.1%
39 | 1 | < 0.1%
37 | 1 | < 0.1%
35 | 5 | < 0.1%
34 | 3 | < 0.1%
33 | 4 | < 0.1%

n3
Real number (ℝ≥0)

Distinct: 50
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 745.1420317
Minimum: 500
Maximum: 990
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 500
5-th percentile: 520
Q1: 620
median: 750
Q3: 870
95-th percentile: 970
Maximum: 990
Range: 490
Interquartile range (IQR): 250

Descriptive statistics

Standard deviation: 144.2461959
Coefficient of variation (CV): 0.1935821492
Kurtosis: -1.201793581
Mean: 745.1420317
Median Absolute Deviation (MAD): 120
Skewness: 0.002266545452
Sum: 30690910
Variance: 20806.96504
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
770 | 913 | 2.2%
660 | 902 | 2.2%
630 | 873 | 2.1%
980 | 866 | 2.1%
940 | 862 | 2.1%
680 | 862 | 2.1%
610 | 858 | 2.1%
500 | 855 | 2.1%
700 | 850 | 2.1%
920 | 845 | 2.1%
Other values (40) | 32502 | 78.9%

Value | Count | Frequency (%)
500 | 855 | 2.1%
510 | 785 | 1.9%
520 | 812 | 2.0%
530 | 831 | 2.0%
540 | 788 | 1.9%
550 | 839 | 2.0%
560 | 791 | 1.9%
570 | 798 | 1.9%
580 | 831 | 2.0%
590 | 829 | 2.0%

Value | Count | Frequency (%)
990 | 781 | 1.9%
980 | 866 | 2.1%
970 | 809 | 2.0%
960 | 825 | 2.0%
950 | 830 | 2.0%
940 | 862 | 2.1%
930 | 828 | 2.0%
920 | 845 | 2.1%
910 | 842 | 2.0%
900 | 838 | 2.0%

n4
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION

Distinct: 27
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 962.475454
Minimum: 0
Maximum: 999
Zeros: 15
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 999
Q1: 999
median: 999
Q3: 999
95-th percentile: 999
Maximum: 999
Range: 999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 186.9109073
Coefficient of variation (CV): 0.194198103
Kurtosis: 22.22946263
Mean: 962.475454
Median Absolute Deviation (MAD): 0
Skewness: -4.922189916
Sum: 39642439
Variance: 34935.68728
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=27)
Value | Count | Frequency (%)
999 | 39673 | 96.3%
3 | 439 | 1.1%
6 | 412 | 1.0%
4 | 118 | 0.3%
9 | 64 | 0.2%
2 | 61 | 0.1%
7 | 60 | 0.1%
12 | 58 | 0.1%
10 | 52 | 0.1%
5 | 46 | 0.1%
Other values (17) | 205 | 0.5%

Value | Count | Frequency (%)
0 | 15 | < 0.1%
1 | 26 | 0.1%
2 | 61 | 0.1%
3 | 439 | 1.1%
4 | 118 | 0.3%
5 | 46 | 0.1%
6 | 412 | 1.0%
7 | 60 | 0.1%
8 | 18 | < 0.1%
9 | 64 | 0.2%

Value | Count | Frequency (%)
999 | 39673 | 96.3%
27 | 1 | < 0.1%
26 | 1 | < 0.1%
25 | 1 | < 0.1%
22 | 3 | < 0.1%
21 | 2 | < 0.1%
20 | 1 | < 0.1%
19 | 3 | < 0.1%
18 | 7 | < 0.1%
17 | 8 | < 0.1%

n5
Real number (ℝ)

UNIQUE

Distinct: 41188
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -7.401282414 × 10⁻⁵
Minimum: -4.354231256
Maximum: 4.547728948
Zeros: 0
Zeros (%): 0.0%
Negative: 20560
Negative (%): 49.9%
Memory size: 321.9 KiB

Quantile statistics

Minimum: -4.354231256
5-th percentile: -1.634697615
Q1: -0.6797248422
median: 0.001356889153
Q3: 0.6733795808
95-th percentile: 1.641891473
Maximum: 4.547728948
Range: 8.901960204
Interquartile range (IQR): 1.353104423

Descriptive statistics

Standard deviation: 0.9970236956
Coefficient of variation (CV): -13470.95868
Kurtosis: 0.001034753214
Mean: -7.401282414 × 10⁻⁵
Median Absolute Deviation (MAD): 0.6763812753
Skewness: -0.001647365724
Sum: -3.0484402
Variance: 0.9940562496
Monotonicity: Not monotonic
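
The coefficient of variation is the standard deviation divided by the mean, so it becomes very large in magnitude when the mean is close to zero, as it is for n5. A short check with the reported figures (variable names are illustrative):

```python
# Figures copied from the n5 section of the report.
std = 0.9970236956
total = -3.0484402   # "Sum" as reported
n = 41188            # number of observations

mean = total / n     # close to zero, so the ratio below is large
cv = std / mean

print(f"mean = {mean:.6e}, CV = {cv:.2f}")
```

For a variable centered near zero like this one, the CV says little about spread; the standard deviation or the IQR is the more useful figure.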
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0.00177117304 | 1 | < 0.1%
0.4509224724 | 1 | < 0.1%
0.06879677852 | 1 | < 0.1%
-0.3983256089 | 1 | < 0.1%
-1.301552237 | 1 | < 0.1%
-0.3729084964 | 1 | < 0.1%
-1.80999416 | 1 | < 0.1%
2.201383552 | 1 | < 0.1%
0.8080306096 | 1 | < 0.1%
-0.314169951 | 1 | < 0.1%
Other values (41178) | 41178 | > 99.9%

Value | Count | Frequency (%)
-4.354231256 | 1 | < 0.1%
-4.006671358 | 1 | < 0.1%
-3.968292903 | 1 | < 0.1%
-3.861580784 | 1 | < 0.1%
-3.810956666 | 1 | < 0.1%
-3.774129379 | 1 | < 0.1%
-3.548822293 | 1 | < 0.1%
-3.54463679 | 1 | < 0.1%
-3.539784975 | 1 | < 0.1%
-3.45636241 | 1 | < 0.1%

Value | Count | Frequency (%)
4.547728948 | 1 | < 0.1%
3.665093123 | 1 | < 0.1%
3.62280555 | 1 | < 0.1%
3.579895169 | 1 | < 0.1%
3.568784989 | 1 | < 0.1%
3.550962934 | 1 | < 0.1%
3.495446235 | 1 | < 0.1%
3.484466972 | 1 | < 0.1%
3.407292524 | 1 | < 0.1%
3.394613166 | 1 | < 0.1%

n6
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.1729629989
Minimum: 0
Maximum: 7
Zeros: 35563
Zeros (%): 86.3%
Negative: 0
Negative (%): 0.0%
Memory size: 321.9 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 7
Range: 7
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4949010798
Coefficient of variation (CV): 2.861311858
Kurtosis: 20.10881622
Mean: 0.1729629989
Median Absolute Deviation (MAD): 0
Skewness: 3.832042243
Sum: 7124
Variance: 0.2449270788
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=8)
Value | Count | Frequency (%)
0 | 35563 | 86.3%
1 | 4561 | 11.1%
2 | 754 | 1.8%
3 | 216 | 0.5%
4 | 70 | 0.2%
5 | 18 | < 0.1%
6 | 5 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
0 | 35563 | 86.3%
1 | 4561 | 11.1%
2 | 754 | 1.8%
3 | 216 | 0.5%
4 | 70 | 0.2%
5 | 18 | < 0.1%
6 | 5 | < 0.1%
7 | 1 | < 0.1%

Value | Count | Frequency (%)
7 | 1 | < 0.1%
6 | 5 | < 0.1%
5 | 18 | < 0.1%
4 | 70 | 0.2%
3 | 216 | 0.5%
2 | 754 | 1.8%
1 | 4561 | 11.1%
0 | 35563 | 86.3%

school
Categorical

HIGH CORRELATION
MISSING

Distinct: 7
Distinct (%): < 0.1%
Missing: 1731
Missing (%): 4.2%
Memory size: 321.9 KiB

5 - a lot: 12168
4 - average amount: 9515
3 - a bit more: 6045
5 - a decent amount: 5243
1 - almost none: 4176
Other values (2): 2310

Length

Max length: 19
Median length: 15
Mean length: 14.30633348
Min length: 8

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 5 - a decent amount
2nd row: 5 - a lot
3rd row: 2 - a little bit
4th row: 5 - a lot
5th row: 5 - a lot

Common Values

Value | Count | Frequency (%)
5 - a lot | 12168 | 29.5%
4 - average amount | 9515 | 23.1%
3 - a bit more | 6045 | 14.7%
5 - a decent amount | 5243 | 12.7%
1 - almost none | 4176 | 10.1%
2 - a little bit | 2292 | 5.6%
0 - none | 18 | < 0.1%
(Missing) | 1731 | 4.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
(no label) | 39457 | 23.0%
a | 25748 | 15.0%
5 | 17411 | 10.2%
amount | 14758 | 8.6%
lot | 12168 | 7.1%
4 | 9515 | 5.6%
average | 9515 | 5.6%
bit | 8337 | 4.9%
3 | 6045 | 3.5%
more | 6045 | 3.5%
Other values (7) | 22391 | 13.1%

successful_sell
Boolean

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 40.3 KiB

False: 36548
True: 4640

Value | Count | Frequency (%)
False | 36548 | 88.7%
True | 4640 | 11.3%

Interactions

[Figures: interaction plots (Matplotlib v3.4.3 SVG images)]
2021-10-03T21:29:02.920705image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:04.383782image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:06.034539image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:07.835642image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:09.705537image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:54.246487image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:55.673221image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:57.015868image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:58.459078image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:00.305233image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:01.730636image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:03.041833image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:04.515291image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:06.180831image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:07.983214image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:09.821444image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:54.380437image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:55.791352image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:57.122568image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:28:58.643388image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:00.429685image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:01.865247image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:03.150509image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:04.631147image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:06.315208image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/
2021-10-03T21:29:08.152281image/svg+xmlMatplotlib v3.4.3, https://matplotlib.org/

Correlations


Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures monotonic correlation between two variables and is therefore better than Pearson's r at catching nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of the standard deviations of those rank variables.
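
The calculation above can be sketched in a few lines of plain Python. This is only an illustration, not the code the report generator runs: the helper names `average_ranks`, `pearson_r` and `spearman_rho` are made up for this example, and ties are assumed to receive the mean of the rank positions they occupy.

```python
from math import sqrt

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(x, y):
    """Covariance divided by the product of standard deviations.

    The 1/n factors cancel, so plain sums of deviations are enough.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]      # monotonic but nonlinear in x
print(spearman_rho(x, y))    # the ranks agree exactly
print(pearson_r(x, y))       # below 1 because the relation is not linear
```

On data that is monotonic but not linear, as here, ρ reaches its maximum while r stays below 1. That is the difference the description above points at.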

Pearson's r

Pearson's correlation coefficient (r) measures linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and positive scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
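
As a rough illustration of this formula and of the invariance property, the sketch below computes r by hand and then recomputes it after shifting and rescaling both variables. The function name `pearson_r` and the sample values are invented for the example; they are not taken from the dataset.

```python
from math import sqrt

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations.

    The 1/n factors in the covariance and in the standard deviations cancel,
    so sums of deviations from the mean are used directly.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical sample values, roughly linear.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson_r(x, y)

# Shift and rescale each variable separately (positive scale factors).
r_rescaled = pearson_r([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])

# A negative scale factor flips the sign but keeps the magnitude.
r_flipped = pearson_r(x, [-b for b in y])

print(r, r_rescaled, r_flipped)
```

The first two values agree up to floating-point rounding, and the third has the same magnitude with the opposite sign.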

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
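
The pair counting described above can be sketched as follows. This implements the plain definition exactly as stated (often called tau-a); statistical libraries commonly default to a tie-adjusted variant, so values on data with many ties may differ from what a report shows. The function name `kendall_tau_a` and the sample data are assumptions made for this example.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs.

    A pair of observations is concordant when x and y move in the same
    direction between them, discordant when they move in opposite
    directions. Tied pairs count as neither.
    """
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

# Hypothetical data: one swapped pair out of six possible pairs.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)
print(tau)  # 5 concordant, 1 discordant, 6 pairs in total
```

The double loop is quadratic in the number of observations, which is fine for an illustration but slow for a table with tens of thousands of rows.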

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
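
For readers who want to see the shape of the computation, here is a minimal sketch of the bias-corrected estimator as it is usually written: the chi-squared statistic is divided by n, a correction term is subtracted, and the table dimensions are shrunk slightly. The function name `cramers_v_corrected` and the toy columns are assumptions for illustration; this is not the report generator's own code.

```python
from collections import Counter
from math import sqrt

def cramers_v_corrected(a, b):
    """Bias-corrected Cramér's V for two equally long sequences of labels."""
    n = len(a)
    joint = Counter(zip(a, b))
    row_totals = Counter(a)
    col_totals = Counter(b)

    # Pearson chi-squared statistic over the full contingency table.
    chi2 = 0.0
    for row_label, row_total in row_totals.items():
        for col_label, col_total in col_totals.items():
            expected = row_total * col_total / n
            observed = joint.get((row_label, col_label), 0)
            chi2 += (observed - expected) ** 2 / expected

    r, k = len(row_totals), len(col_totals)
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association
    return sqrt(phi2_corr / denom)

# Toy columns: b_same is a relabelling of a, b_indep is balanced against a.
a = ["x", "y"] * 50
b_same = ["p", "q"] * 50
a2 = ["x", "x", "y", "y"] * 25
b_indep = ["p", "q", "p", "q"] * 25

print(cramers_v_corrected(a, b_same))    # perfect association
print(cramers_v_corrected(a2, b_indep))  # independence
```

The `max(0.0, ...)` clamp keeps the corrected value from going negative when the observed association is smaller than the expected bias.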

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for it.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
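
Nullity correlation, as described for the heatmap, can be pictured as an ordinary Pearson correlation between per-column "value is present" indicators. The sketch below assumes missing entries are represented by `None`; the function name `nullity_correlation` and the three toy columns are invented for the example and do not come from the dataset.

```python
from math import sqrt

def nullity_correlation(col_a, col_b):
    """Pearson correlation between the presence indicators of two columns.

    Returns None when either column is always present or always missing,
    because the correlation is undefined without variation.
    """
    a = [0.0 if v is None else 1.0 for v in col_a]
    b = [0.0 if v is None else 1.0 for v in col_b]
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    if var_a == 0 or var_b == 0:
        return None
    return cov / sqrt(var_a * var_b)

# Toy columns: col_b is missing exactly where col_a is missing,
# col_c is missing exactly where col_a is present.
col_a = [1, None, 3, None, 5]
col_b = ["x", None, "y", None, "z"]
col_c = [None, 2, None, 4, None]

print(nullity_correlation(col_a, col_b))  # close to +1: missing together
print(nullity_correlation(col_a, col_c))  # close to -1: one present, other missing
```

A value near +1 means two variables tend to be missing in the same rows, and a value near -1 means one tends to be present exactly when the other is absent.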

Sample

First rows

age b1 b2 c10 c3 c4 c8 dow employment i1 i2 i3 i4 i5 marriage-status month n2 n3 n4 n5 n6 school successful_sell
034yesnonoFalsenewNaNfrimanagement-1.893.075-47.11.4055099.1divorcedapr25309990.00177105 - a decent amountno
128yesnoyesFalsenewNaNthuassistant-1.892.893-46.21.3275099.1divorcedmay1750999-1.67315205 - a lotyes
255nononounknownnewNaNtueleisure1.493.918-42.74.9625228.1marriedjul36009990.92794602 - a little bitno
347yesnonoFalsenewNaNmonassistant-0.193.200-42.04.1915195.8marriednov18609990.20301305 - a lotno
449nononounknownnewNaNtueassistant1.493.918-42.74.9615228.1marriedjul66209990.99080405 - a lotno
548nononoFalsenewNaNtueengineer-0.193.200-42.04.1535195.8marriednov4620999-0.76665805 - a lotno
630nononounknownnewNaNfricustomer service-1.892.893-46.21.3135099.1singlemay2530999-0.55262804 - average amountno
732yesyesnoFalsenewNaNthumanagement-2.992.963-40.81.2605076.2marriedjun2900999-0.15961704 - average amountno
853yesyesnoFalsenewNaNthumanagement-1.893.749-34.60.6405008.7divorcedapr66609990.14067805 - a lotno
930nononoFalseoldNaNmonlaborer1.193.994-36.44.8575191.0marriedmay46509990.57052804 - average amountno

Last rows

age b1 b2 c10 c3 c4 c8 dow employment i1 i2 i3 i4 i5 marriage-status month n2 n3 n4 n5 n6 school successful_sell
4117880noyesnoFalsenewnofrileisure-3.092.713-33.00.7185023.5divorceddec59109990.54511311 - almost noneno
4117933noyesyesFalsenewnowedlaborer-1.892.893-46.21.2815099.1marriedmay2840999-0.41759111 - almost noneyes
4118024yesnonoFalsenewNaNtueengineer-2.992.963-40.81.2625076.2singlejun27809990.77480804 - average amountno
4118132yesyesnoFalseoldNaNtuemanagement-0.193.200-42.04.7005195.8marriednov1530999-0.63033305 - a lotno
4118247yesnonoFalsenewNaNmoncustomer service1.493.918-42.74.9625228.1marriedjul106709991.52786802 - a little bitno
4118333yesnonoFalseoldNaNmonassistant1.494.465-41.84.8655228.1marriedjun3620999-0.05002204 - average amountno
4118436yesnonoFalseoldNaNmonengineer1.494.465-41.84.9615228.1marriedjun1650999-2.31050405 - a lotno
4118536nononoFalsenewNaNmonengineer1.493.918-42.74.9625228.1divorcedjul36209992.14423805 - a decent amountno
4118650nononoFalseoldNaNfrihobbyist1.494.465-41.84.9595228.1marriedjun28809990.35914401 - almost noneno
4118728yesnonoFalseoldNaNtuelaborer1.193.994-36.44.8575191.0marriedmay25609992.31313002 - a little bitno